-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Collisionless containers (DO NOT MERGE) #217
WIP: Collisionless containers (DO NOT MERGE) #217
Conversation
Data/HashMap/Base.hs
Outdated
showString "fromList " . shows (toList m) | ||
-- instance (Show k, Show v) => Show (HashMap k v) where | ||
-- showsPrec d m = showParen (d > 10) $ | ||
-- showString "fromList " . shows (toList m) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This instance must be restored!
denial of service exploit. However, we can use the fact that `hashable` supports | ||
salted hashing by default to alleviate at least some of this problem. In the | ||
current version, insertion time is not `O(number of collisions)` but `O(number | ||
of consequtive levels of collision)`. While this is not better in case all the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/consequtive/consecutive/
Hi @NorfairKing! Since @emilypi and I have joined @treeowl and @tibbe as maintainers of this package, we've been discussing what to do about the HashDoS potential and what to make of this PR. I currently think it's very intriguing, others are more sceptical. So far the focus of this package and http://hackage.haskell.org/package/hashable-1.3.0.0/docs/Data-Hashable.html#g:1 But as you are fully aware, some users still use So if your approach can sufficiently reduce the HashDoS potential without reducing In any case, we have lots of questions. I'll start with my own:
|
@treeowl There are a lot of open "conversations" resulting from previous reviews of yours. Could you possibly hide those that have been resolved, by clicking on "Resolve conversation" for those? |
The approach is new-ish. See https://stackoverflow.com/questions/53495133/is-this-approach-to-dealing-with-hash-collisions-new-unique
Yes, I have >30K collisions in a ~1MiB json object that can keep a json-parsing server busy for minutes. I've already shared this with tibbe and treeowl.
We did the benchmarks as well and saw no difference. That makes sense because this implementation only differs when there are (many) conflicts. If I remember correctly it's in the package that I sent to treeowl and tibbe.
Well the pr needs to be rebased on top of something more recent, but all the tests still passed when I last did that. |
Many thanks for the quick response, @NorfairKing! :) For now, I'd just like to let you know that our internal discussion is progressing further, and I think we'll be able to give you more feedback within in the next few weeks. |
Thanks for picking this up. I 'm curious to hear more but I am no longer being payed to work on this so I will not be able to justify spending a lot of time on this. |
Out of curiosity: Have you ever considered using
|
@sjakobi I haven't. We did this work because |
@NorfairKing Is there a reason this is still WIP and DO NOT MERGE? @sjakobi The "internal discussion" seems to have gotten lost in the shuffle a year back. A timely fix would be good at this point. Can we merge? |
There's still considerable debate about this. Johan Tibell has said no, and
he knows more about this package than anyone.
…On Sat, Sep 11, 2021, 2:54 PM Thomas M. DuBuisson ***@***.***> wrote:
@NorfairKing <https://github.com/NorfairKing> Is there a reason this is
still WIP and DO NOT MERGE?
@sjakobi <https://github.com/sjakobi> The "internal discussion" seems to
have gotten lost in the shuffle a year back. A timely fix would be good at
this point. Can we merge?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#217 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOOF7LGIXFZQTFLRUPDKADUBOQXJANCNFSM4F3ORDLA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
|
Is there a link to that discussion so we can hear the objections first hand? As an open source project it would be traditional to have the objection available in the code review. |
I'm not sure. There was a lot of private discussion and a lot of things
have fallen out of my cache. I'm sure he'd be happy to comment publicly.
…On Sat, Sep 11, 2021, 3:00 PM Thomas M. DuBuisson ***@***.***> wrote:
Johan Tibell has said no
Is there a link to that discussion so we can hear the objections first
hand? As an open source project it would be traditional to have the
objection available in the code review.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#217 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOOF7KYTZ3UDNUZZH6XB43UBORMJANCNFSM4F3ORDLA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
|
This was a private conversation as part of the responsible disclosure procedure. |
@NorfairKing You suggested this doesn't change performance, where are the benchmarks to show that? |
Yes, they were included in the original communication, but you can also just try to run the benchmarks locally to see. Like I said:
|
@NorfairKing thanks for your work and investigation and this PR, nicely done! @TomMD it seems a bit unfair to mount pressure on maintainers of I guess from pragmatic perspective it would be much simpler to replace |
@NorfairKing Thanks for the publishing this great write-up, and apologies for responding so slowly to your PR! I'm happy that, now that the vulnerability is properly published, we can discuss it more easily in public. I have opened #319, so we can discuss possible responses to the vulnerability separately from the proposed fix that you have provided in this PR. I invite anyone interested to participate in that discussion! |
Unfortunately we (at least @tibbe) are not convinced that this PR is of much help with preventing collision attacks. Citing #319 (comment):
This is no final judgment on this PR, but I don't currently expect that this PR will be part of a "timely fix" in If you're looking for potential mitigating measures, see #319 (comment) and #319 (comment). |
I agree; this is a fair bit of extra complexity and maintenance burden for something we don't actually know for sure is helpful. |
Hopefully this a helpful argument: https://medium.com/@robertgrosse/generating-64-bit-hash-collisions-to-dos-python-5b21404a5306 even Robert (the author of that blog post) doesn't share the universal collision sets (i.e. they collide for all salts), he is very far in showing they are feasible to construct. FNV-1 is just a weak hash. Thus having hashmap with another hashmap for collision resolution is potentially very bad, as that chain simply will not terminate: It's enough to have just two keys which collide universally! Collision resolution method should be vastly different then reusing (weak) hash function. |
Alright, that convinced me. Closing this for good. |
That's a very good insight, @phadej. I hadn't realized this earlier. |
No description provided.